The Plant Genome

Wiley

Preprints posted in the last 30 days, ranked by how well they match The Plant Genome's content profile, based on 53 papers previously published here. The average preprint has a 0.03% match score for this journal, so anything above that is already an above-average fit.

1
Genomic prediction of single cross families of perennial ryegrass in two nitrogen managements

Santos Junior, D. R. d.; Fe, D.; Lenk, I.; Jensen, C. S.; Asp, T.; Janss, L.; Bornhofen, E.

2026-05-08 genomics 10.64898/2026.05.05.722839 medRxiv
Top 0.1%
31.9%

The performance of a single cross is determined by the average additive effects of the parents, as well as the interactions between them. These quantities can be estimated using an appropriate genetic design, allowing for the estimation of general (GCA) and specific (SCA) combining abilities. The prediction of GCA for new parents and the total genetic value of unrealized crosses can be made when genome-wide marker information is available. Several studies in economically important crops such as maize and rice have demonstrated the potential of genomic-assisted prediction of single-cross performance. However, no study to date has explored its relevance in perennial ryegrass, an obligate allogamous species that is bred in genetically heterogeneous families. In this study, we aimed to estimate genetic parameters and assess the ability of genomic models to predict the performance of F2 families in terms of dry matter yield and nutritive quality traits. We used data from a large partial diallel involving 104 parents from two distinct subpopulations, as inferred by admixture analysis. F2 families were evaluated in multiple environments and under two nitrogen availability conditions. Genotyping-by-sequencing of the parent plants produced 42,145 variants after quality control, which were used to estimate genomic relationships based on identity-by-state. Variance component estimation revealed limited GCA and SCA interactions with the environment, and particularly with nitrogen management. The predictive abilities of two parental models exceeded 0.60 and often surpassed 0.70 for most traits. However, incorporating non-additive effects into the model did not improve predictive ability. We leveraged the genetic diversity among parents to map genomic regions associated with all recorded traits.
Genome-wide association studies (GWAS) by genomic best linear unbiased prediction (GBLUP) identified six quantitative trait loci (QTL) regions, with 45 candidate genes within the linkage disequilibrium range, estimated at approximately 92 kb. Our results demonstrate that genomic prediction of single crosses can be performed with high accuracy, especially when both parents are also progenitors of families in the training set.
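
The abstract above mentions genomic relationships estimated from identity-by-state (IBS). As a minimal sketch of one common IBS definition, assuming genotypes coded as 0/1/2 allele counts, the relationship between two parents can be taken as the mean proportion of shared alleles across markers. The function names and toy data are hypothetical, and the preprint may use a different estimator.

```python
def ibs_similarity(g1, g2):
    """Mean identity-by-state between two genotypes coded as 0/1/2 allele counts."""
    if len(g1) != len(g2) or not g1:
        raise ValueError("genotype vectors must be non-empty and equal length")
    # Per marker, the shared-allele proportion is 1 - |a - b| / 2.
    return sum(1.0 - abs(a - b) / 2.0 for a, b in zip(g1, g2)) / len(g1)


def ibs_matrix(genotypes):
    """Symmetric IBS relationship matrix for a list of genotype vectors."""
    n = len(genotypes)
    return [[ibs_similarity(genotypes[i], genotypes[j]) for j in range(n)]
            for i in range(n)]


# Toy data (an assumption): three hypothetical parents scored at four markers.
parents = [[0, 1, 2, 2], [0, 1, 2, 2], [2, 1, 0, 0]]
K = ibs_matrix(parents)
```

In this toy example the first two parents are identical at every marker, while the third shares alleles with them only at the heterozygous marker.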

2
Reaction Norm Modeling of High-Dimensional Genomic and Environmental Data Improves Prediction Accuracy in Winter Wheat

Acharya, S. R.; Garcia-Abadillo, J.; Lyerly, J.; Brown-Guedira, G.; Jarquin, D.; Bandillo, N.

2026-05-08 genetics 10.64898/2026.05.05.722758 medRxiv
Top 0.1%
18.8%

Genomic prediction models that account for genotype-by-environment (GxE) interactions have the potential to accelerate the rate of genetic gain for yield and agronomic performance, yet relatively few studies have applied GxE prediction in public soft red winter wheat (Triticum aestivum) breeding programs. In this study, we extended a reaction norm-based genomic prediction framework by integrating weather-based environmental covariates to more effectively capture genotype-environment interactions. Key agronomic traits, including seed yield, plant height, test weight, and heading date, were evaluated across 33 environments (location-year) using over 3,200 breeding lines from the North Carolina State University small grains breeding program. Multiple genomic prediction models were compared using several cross-validation (CV) schemes representing common breeding scenarios. Across traits, the reaction norm M5 model, which incorporates both GxE and genotype-by-environmental covariate interactions (GxO), achieved the highest prediction accuracy (PA) in CV2 (predicting incomplete field trials) and, for yield and test weight, in CV1 (predicting new lines). The highest PA was observed for test weight under CV2 (0.54) and for yield under CV1 (0.41). Under CV0 (predicting new environments), the M3 model incorporating GxE produced the highest PA across traits, with the greatest accuracy for plant height (0.45), although differences among M2, M3, and M4 were small. Prediction under CV00 (predicting new lines in new environments) remained more challenging, with PA values of 0.10 to 0.20 across traits. Overall, our results demonstrate that integrating environmental covariates into genomic prediction models can improve predictive performance across diverse wheat-growing environments in North Carolina, supporting their utility for applied breeding efforts.

Core ideas: Integrating genotype-by-environment (GxE) interactions with environmental covariates improves prediction accuracy across environments. Model performance varies by prediction scenario, with different approaches performing best for new lines, incomplete trials, or new environments. Prediction of new lines in new environments remains challenging.

Plain language summary: This study explores how adding environmental information to genomic prediction models can improve prediction accuracy in a public winter wheat breeding program. Using data from multi-environment trials conducted across diverse conditions in North Carolina, we evaluated statistical models that capture how different wheat lines respond to changing environments. By incorporating weather data, we improved the ability to predict performance across locations and years. These findings provide practical insights for refining selection strategies and accelerating genetic gain in wheat breeding.
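
The four cross-validation scenarios named in this abstract differ only in which line-by-environment records are hidden from the model. The sketch below shows one way to build the splits, assuming each record is a (line, environment) pair. The function name, arguments, and the rule of discarding partially overlapping records under CV00 are assumptions for illustration, not the authors' code.

```python
def split_records(records, scheme, test_lines=frozenset(), test_envs=frozenset(),
                  masked=frozenset()):
    """Return (train, test) lists of (line, env) records for one CV scheme."""
    train, test = [], []
    for rec in records:
        line, env = rec
        new_line, new_env = line in test_lines, env in test_envs
        if scheme == "CV1":      # new lines: hold out every record of the test lines
            is_test, is_train = new_line, not new_line
        elif scheme == "CV2":    # incomplete trials: hold out selected line-env cells
            is_test, is_train = rec in masked, rec not in masked
        elif scheme == "CV0":    # new environments: hold out whole environments
            is_test, is_train = new_env, not new_env
        elif scheme == "CV00":   # new lines in new environments
            is_test = new_line and new_env
            # Records sharing only the line or only the environment are discarded.
            is_train = not new_line and not new_env
        else:
            raise ValueError(f"unknown scheme: {scheme}")
        if is_test:
            test.append(rec)
        elif is_train:
            train.append(rec)
    return train, test


# Toy layout (an assumption): three lines observed in two environments.
records = [(l, e) for l in ("L1", "L2", "L3") for e in ("E1", "E2")]
train00, test00 = split_records(records, "CV00", test_lines={"L3"}, test_envs={"E2"})
train1, test1 = split_records(records, "CV1", test_lines={"L3"})
```

Under CV00 the training set shrinks the most, because any record that shares a line or an environment with the test set is dropped. That is consistent with CV00 being the hardest scenario reported above.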

3
Temporal changes in allele frequency facilitate detection of adaptive variants in winter wheat (Triticum aestivum L.) breeding programs

Johansen, N. H.; Sarup, P.; Hansen, P.; Orabi, J.; Jahoor, A.; Ramstein, G. P.

2026-05-04 genetics 10.64898/2026.04.30.721918 medRxiv
Top 0.1%
14.9%

In quantitative genetics, candidate SNPs are identified through genotype-phenotype associations inferred with genome-wide association studies (GWAS). In this study, we explore an alternative approach to detect genetic variants with non-neutral effects by tracking temporal trends in allele frequency in a winter wheat (Triticum aestivum L.) breeding population over an eight-year period, from which signals of selection may be inferred. Selection signatures were inferred with a generalized linear model, where we modeled trends in allele frequency as a function of time (crossing year). These signatures of selection were used to prioritize variants. Associations between phenotypic performance and individual load of prioritized variants were then investigated. Furthermore, we assessed whether incorporating selection information into a genomic best linear unbiased prediction (GBLUP) model improves model performance in terms of quality of fit and prediction ability. Our findings indicate that the inferred signals of selection are effective in identifying non-neutral variants. Variants under strong negative selection were associated with a decrease in protein content adjusted for grain yield (p-value < 0.01), while genetic variants that had been under moderate to high levels of positive selection were associated with increased grain yield (p-value < 0.01). However, incorporating selection information did not improve prediction accuracy. In conclusion, temporal trends in allele frequency can be used to detect non-neutral variants. The proposed approach may hence complement traditional quantitative genetic methods for detecting non-neutral genetic variation. This approach may allow breeders to detect non-neutral variants earlier in the breeding cycle, without resorting to phenotypic data.
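
The abstract describes modelling the trend in allele frequency as a function of crossing year with a generalized linear model. As a hedged illustration, the sketch below fits a binomial GLM with a logit link to allele counts for a single variant using Newton-Raphson. The link function, the function name, and the toy counts are assumptions; the preprint does not specify these details here.

```python
import math


def allele_frequency_trend(years, alt_counts, totals, iterations=25):
    """Fit logit(p) = a + b * (year - mean year) by Newton-Raphson.

    Returns (a, b); b is the change in log-odds of the allele per year.
    """
    mean_year = sum(years) / len(years)
    t = [y - mean_year for y in years]  # centring improves numerical stability
    a = b = 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for ti, yi, ni in zip(t, alt_counts, totals):
            p = 1.0 / (1.0 + math.exp(-(a + b * ti)))
            resid = yi - ni * p          # score contribution
            w = ni * p * (1.0 - p)       # binomial variance weight
            g0 += resid
            g1 += resid * ti
            h00 += w
            h01 += w * ti
            h11 += w * ti * ti
        det = h00 * h11 - h01 * h01
        if det == 0.0:
            raise ValueError("singular information matrix")
        # Newton step: solve the 2x2 system H * delta = g.
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a, b = a + da, b + db
        if abs(da) + abs(db) < 1e-10:
            break
    return a, b


# Toy data (an assumption): allele counts out of 200 sampled alleles per year.
years = list(range(2015, 2023))
rising = [40, 50, 62, 75, 90, 104, 120, 133]
flat = [100] * 8
_, slope_rising = allele_frequency_trend(years, rising, [200] * 8)
_, slope_flat = allele_frequency_trend(years, flat, [200] * 8)
```

A positive slope suggests an allele that increased in frequency over the period and a slope near zero suggests no directional change. Separating selection from drift would need a significance test or a neutral expectation, which this sketch omits.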

4
The stability of fatty acid composition in sunflower oil is dependent on environment and affected by structural variation

Ingold, M.; Gao, Q.; Mandel, J. R.; McNellie, J. P.; Keepers, K. G.; Barb, J. G.; Burke, J. M.; Rieseberg, L. H.; Hulke, B. S.

2026-05-07 plant biology 10.64898/2026.05.04.722759 medRxiv
Top 0.1%
14.5%

In sunflower (Helianthus annuus L.), the composition of fatty acids in the seeds, primarily oleic, linoleic, stearic and palmitic acid, is of utmost importance for oil quality. Despite this, the genetic basis of this trait and its interaction with the environment is poorly understood. Understanding this interaction is critical to improvement of sunflower within the context of climate change. In this work, we incorporated fatty acid composition measurements from the sunflower SAM population and eight environments across an extensive geographic cline into GWAS. The SAM panel consists of 287 varieties representing approximately 90% of sunflower diversity, for which 2.2 million high-quality SNPs with a MAF > 5% are available. For increased power, multivariate GWAS was performed with four different inputs: (i) mean fatty acid composition within each environment, (ii) mean fatty acid composition within each environment omitting high oleic varieties, (iii) trait stability within environments quantified by standard errors among replicate samples ( stability) and (iv) Eberhart and Russell's β, which quantifies trait stabilities across environments (β stability). All four analyses yielded highly significantly associated SNPs. We found that high oleic varieties exhibited high β trait stability, resulting in substantial overlap in markers between analyses (i) and (iv), with signals being fairly consistent between environments in analysis (i). For analyses (ii) and (iii), significant markers tended to vary between trials. For significant SNPs across all analyses, 147 candidate genes were identified, including promising candidates such as 15 fatty acid metabolism genes, 6 heat shock proteins and 22 transcription factors. Lastly, a large introgression consisting of two flanking inverted sequences on Chromosome 5 was found to coincide with stability in the Georgia trial, suggesting a role in FA composition stability under high heat conditions.
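
For readers unfamiliar with Eberhart and Russell's stability coefficient, the sketch below computes it in its textbook form: each genotype's trait values are regressed on an environmental index, defined as the environment mean minus the grand mean. The data layout and function name are assumptions, and the preprint's exact implementation may differ.

```python
def eberhart_russell_beta(yields):
    """yields maps genotype -> list of trait values, one per environment."""
    genotypes = list(yields)
    n_env = len(yields[genotypes[0]])
    # Environmental index: environment mean minus the grand mean.
    env_means = [sum(yields[g][j] for g in genotypes) / len(genotypes)
                 for j in range(n_env)]
    grand_mean = sum(env_means) / n_env
    index = [m - grand_mean for m in env_means]
    ss_index = sum(i * i for i in index)
    if ss_index == 0.0:
        raise ValueError("environments do not differ; beta is undefined")
    # Regression coefficient of each genotype on the environmental index.
    return {g: sum(y * i for y, i in zip(yields[g], index)) / ss_index
            for g in genotypes}


# Toy data (an assumption): two varieties measured in four environments.
trait = {
    "responsive": [7.0, 10.0, 11.5, 14.5],
    "stable": [9.0, 10.0, 10.5, 11.5],
}
betas = eberhart_russell_beta(trait)
```

A coefficient below 1 marks a genotype whose trait changes less than average as environments change, and a coefficient above 1 marks a more responsive genotype.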

5
Identifying water stress response haplotypes in barley using latent environmental covariates

Aldiss, Z.; Brunner, S.; Heidariask, B.; Chenu, K.; Van Haeften, S.; Baraibar, S.; Ganesgalingam, D.; Moody, D.; Hickey, L.; Lam, Y.

2026-05-07 plant biology 10.64898/2026.05.04.722807 medRxiv
Top 0.1%
13.8%

Purpose: Genotype-by-environment (G x E) interactions represent a major obstacle to increasing genetic gain in crop breeding, with the underlying physiological drivers often remaining obscured within conventional statistical models. This case study presents a novel framework that transforms the latent factors from Factor Analytic (FA) multi-environment trial (MET) models into heritable quantitative traits, enabling the genetic dissection of adaptive response patterns. Methods: A Factor Analytical Linear Mixed Model (FA-LMM) was fit to plot-level yield data for 1,036 barley genotypes across eight Australian trials. Results: Correlation of the factor loadings with APSIM-simulated environmental covariates demonstrated that the second latent factor FA2 was strongly correlated with the Water Stress Index (r = -0.83) during the critical flowering period, establishing water availability as the main biological axis of crossover G x E. Genotypic scores for the derived traits, Overall Performance (OP) and Water Stress Response (WSR), were subjected to high-resolution haplotype-based mapping using local Genomic Estimated Breeding Values (GEBV). Conclusion: This analysis successfully identified major genomic regions that accounted for a substantial proportion of the additive genetic variance. Gene Ontology enrichment of candidate genes within the top haploblocks implicated fundamental pathways related to energy homeostasis, root development, and stress response, with notable candidates including FTsH11, BPS1, and TDP1. The distribution of favourable Haplotypes of Interest (HOI) in elite cultivars suggested a historical signature of inadvertent selection for these adaptive mechanisms. This framework provides an explicit bridge between statistical modelling and functional genomics, offering breeders actionable genetic targets for accelerated development of climate-resilient cereals.

6
Reduction of Pollen Number and Anther Length in Bread Wheat Studied by a Nested Association Mapping Population

Hamaya, N.-B.; Kakui, H.; Okada, M.; Jilu, N.; Jung, K.; Nitta, M.; Wicker, T.; Keller, B.; Nasuda, S.; Shimizu, K. K.

2026-05-23 plant biology 10.64898/2026.05.22.727104 medRxiv
Top 0.1%
12.2%

The number of pollen grains, which carry male gametes in seed plants, has attracted interest in genetics, evolution, and breeding. Rapid evolutionary reductions in pollen number and anther length were reported in selfing species as well as domesticated species, although this poses a challenge for hybrid breeding. Here, we studied the variation of pollen number and anther length of the hexaploid bread wheat (Triticum aestivum) by employing a quick pollen counting method. Pollen numbers in cultivars were lower than those in landraces among 54 lines of diverse geographic origins. Using the year of registration of traditional and modern cultivars, we found a reduction in pollen number over the past 150 years. We detected high heritability and variation among Asian landraces and cultivars. Thus, we conducted QTL mapping of pollen number as well as of anther length using nested association mapping lines in which Norin 61 was the common parent. Genomic loci encompassing Green Revolution genes (Rht-B1, Rht-D1, and Ppd-D1) showed significant effects on pollen number and anther length, but their contributions were relatively minor. Although anther length has often been used as a proxy for pollen number in bread wheat, our data showed that their correlations are not necessarily high. Interestingly, we identified a new QTL of pollen number that was not detected by measuring anther length, and, vice versa, a new QTL specific to anther length. These data suggest that pollen number has reduced rapidly in bread wheat but can be modified using the genetic diversity of landraces.

Significance statement: We found that modern cultivars of bread wheat have reduced pollen number and shorter anther length, which are common in domesticated species but can be a challenge for hybrid breeding. Using underutilized Asian landraces and cultivars, we reported that new quantitative trait loci, as well as loci used in the Green Revolution, are responsible for the traits, which can be employed to increase pollen numbers.

7
Characterization of genetically effective cells and EMS mutagenesis on the novel winter oil seed Pennycress (Thlaspi arvense)

Brusa, A.; Branch, C.; Sulivan, L.; Chopra, R.; Rai, K.; Rockstad, G.; Gjesvold, E. S.; Ott, M.; Jain, S.; Biel, C. C.; Marks, M. D.

2026-05-05 genomics 10.64898/2026.04.30.722012 medRxiv
Top 0.1%
12.2%

Pennycress (Thlaspi arvense L.) is an intermediate winter oilseed crop that has only recently been domesticated for agronomic use. Improving agronomic traits requires sources of genetic variation, and mutagenesis is frequently used to help overcome the limitations of natural populations. We investigate the impact of ethyl methanesulfonate (EMS) on genetically effective cells (GECs) to characterize the intra-individual genetic variation of EMS mutagenesis in pennycress. We identified that pennycress contains at least 4 GECs which, when treated with EMS, create unique mutations across different branches within the same individual plant. We then propagated the M2 plants for whole genome sequencing, providing extensive characterization of the EMS mutation profile and developing a gene index as a resource for future reverse genetic screenings.

Article summary: Pennycress is an emerging winter oilseed crop in the American Midwest. Domestication efforts have advanced rapidly through a combination of genetic techniques. One of the most successful methods has been the use of a mutant gene index, a large collection of pennycress seed where new genetic variation has been created through ethyl methanesulfonate (EMS). EMS mutations are not uniform, however, and a single treated seed can have wide genetic variation within the resulting plant. We investigate the role of genetically effective cells on EMS variation, and present the full EMS population as a resource for further pennycress domestication efforts.

8
Efficient Optimization of Genotype Pairs for Intercropping using Genomic Prediction and Bayesian Optimization

Kinoshita, S.; Iwata, H.

2026-05-18 genomics 10.64898/2026.05.15.725387 medRxiv
Top 0.1%
9.1%

Intercropping is a promising strategy to improve productivity and sustainability in agricultural systems, but designing effective genotype combinations remains a major challenge owing to the rapid increase in possible pairings as the number of candidate genotypes increases. This creates a practical bottleneck because field evaluation of all combinations is infeasible under realistic resource constraints. Here, we propose a framework that integrates genomic prediction and Bayesian optimization to support efficient decision-making for intercropping system design. Using genome-wide marker data from sorghum and soybean, we simulated intercropping performance across 5,214 genotype pairs under certain genetic architectures, including variation in heritability, correlations between direct and indirect genetic effects, and the contribution of pair-specific interactions. Genomic prediction models incorporating direct and indirect genetic effects substantially improved prediction accuracy compared with models based on direct genetic effects alone, and inclusion of specific mixing ability further enhanced the performance under high-heritability conditions. When coupled with Bayesian optimization, the models rapidly identified superior genotype pairs, requiring fewer evaluation cycles than random or prediction-only search strategies. Acquisition functions that account for predicted uncertainty were most effective in complex scenarios involving interaction effects or negative correlations between direct and indirect effects. These results demonstrate that combining genomic prediction with Bayesian optimization can substantially reduce the experimental burden associated with intercropping design, while improving the efficiency of identifying high-performing genotype pairs. 
The proposed framework provides a practical approach for prioritizing candidate mixtures in breeding and field evaluation, and contributes to the development of data-driven strategies for sustainable agricultural systems.

Highlights: A data-driven framework was developed to optimize genotype pairs in intercropping. Modeling indirect effects improved prediction accuracy across genotype pairs. Pair-specific interactions enhanced prediction under high-heritability conditions. Bayesian optimization identified superior pairs under limited evaluation capacity. The framework reduces field-testing requirements for intercropping system design.
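
The abstract notes that acquisition functions accounting for predicted uncertainty were most effective in complex scenarios. It does not name the functions used, so the sketch below uses the upper confidence bound (UCB) as one familiar example of this class. The pair labels, predicted means, and standard deviations are hypothetical.

```python
def upper_confidence_bound(mean, sd, kappa=2.0):
    """Acquisition score that rewards both high predictions and high uncertainty."""
    return mean + kappa * sd


def next_pair(candidates, kappa=2.0):
    """candidates maps a genotype pair -> (predicted mean, predictive sd)."""
    return max(candidates,
               key=lambda pair: upper_confidence_bound(*candidates[pair], kappa))


# Hypothetical predictions for three untested sorghum-soybean pairs.
predictions = {
    ("S1", "B1"): (5.0, 0.1),   # high mean, well characterised
    ("S1", "B2"): (4.6, 0.5),   # slightly lower mean, much less certain
    ("S2", "B1"): (3.0, 0.2),
}
greedy_choice = next_pair(predictions, kappa=0.0)
exploring_choice = next_pair(predictions, kappa=2.0)
```

With kappa set to zero the rule reduces to a prediction-only search and picks the pair with the highest predicted mean. A positive kappa shifts the choice toward pairs whose performance is still uncertain, which is the exploratory behaviour the abstract credits in complex scenarios.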

9
A novel matrix multiplication framework for modeling genotype-by-environment interaction in genomic prediction

Montesinos-Lopez, O. A.; Montesinos-Lopez, A.; Montesinos-Lopez, J. C.; Crossa, J.; Dreisigacker, S.; Hernandez-Suarez, C. M.; Ortiz, R.

2026-05-15 genetics 10.64898/2026.05.11.724414 medRxiv
Top 0.1%
8.2%

Accurate modeling of genotype-by-environment (GxE) interaction is critical for genomic prediction in plant breeding but remains challenging due to complex interaction structures. Conventional models often use the Hadamard product of genotype and environment covariance matrices to capture joint similarity, which may not fully represent GxE complexity. Here we propose a novel framework that derives covariance structures from the matrix multiplication of genotype and environment kernels, decomposing these into symmetric components incorporated as random effects in mixed models. Evaluated for 11 wheat and rice multi-environment datasets and across, this approach consistently outperformed the traditional Hadamard-based model, improving prediction accuracy by up to 13.2% in Pearson's correlation and enhancing top-selection accuracy. Combining both methods yielded the highest performance, indicating complementary information capture. This framework offers a flexible, interpretable, and computationally feasible extension for modeling GxE interaction, potentially enhancing genomic selection effectiveness under diverse environmental conditions.
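
To make the contrast in this abstract concrete, the sketch below compares the conventional Hadamard (element-wise) product of two kernels with their ordinary matrix product. The matrix product of two symmetric kernels is generally not symmetric, so a symmetrization step is needed before it can serve as a covariance structure. The abstract does not describe the authors' decomposition, so the simple symmetric part (M + Mᵀ)/2 used here is only an assumption, and it is not guaranteed to be positive semi-definite. The kernel values are hypothetical.

```python
def hadamard(a, b):
    """Element-wise product of two equally sized square matrices."""
    return [[x * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def matmul(a, b):
    """Ordinary matrix product of two square matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def symmetric_part(m):
    """(M + M^T) / 2; an illustrative symmetrisation, not the authors' decomposition."""
    n = len(m)
    return [[(m[i][j] + m[j][i]) / 2.0 for j in range(n)] for i in range(n)]


# Hypothetical observation-level kernels for three line-environment records.
K_g = [[1.0, 0.5, 0.2], [0.5, 1.0, 0.3], [0.2, 0.3, 1.0]]   # genomic similarity
K_e = [[1.0, 0.8, 0.1], [0.8, 1.0, 0.4], [0.1, 0.4, 1.0]]   # environmental similarity

H = hadamard(K_g, K_e)          # conventional GxE kernel, symmetric by construction
P = matmul(K_g, K_e)            # product kernel, generally not symmetric
S = symmetric_part(P)
```

The Hadamard entry for two records depends only on those two records, whereas each entry of the matrix product sums over all records, so the two kernels encode different notions of joint similarity.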

10
A weighted multi-trait approach for heterotic grouping of maize inbred lines under Striga infestation and optimum environments

Abubakar, A. M.; Adejumobi, I. I.; Mengesha, W. A.; Meseka, S.; Oyekunle, M.; Ado, S. G.; Bonkoungou, T. O.; Badu-Apraku, B. A.; Derera, J.

2026-05-16 genetics 10.64898/2026.05.15.725596 medRxiv
Top 0.1%
6.9%

Maximum utilization of existing genetic variability in a breeding program depends on the efficient classification of the inbred lines into heterotic groups, particularly under stress conditions. This study applied practical breeding approaches to determine the mode of genetic inheritance for Striga resistance, proposed a weighted heterotic grouping method based on the general combining ability of multiple traits (WHGCAMT), and compared its effectiveness with existing methods for classifying the inbred lines into heterotic groups in Striga-infested and optimum environments. Using Diallel design IV, 300 crosses were generated from 21 inbred lines and 4 standard testers. The crosses, along with six checks, were evaluated in an 18 x 17 alpha lattice design with two replications at two locations, in both artificial Striga-infested and Striga-free environments. The inbred lines were genotyped using DArTtag SNP markers. Phenotypic and genotypic data were analyzed using R. Analysis of variance revealed significant mean squares for hybrid, general combining ability (GCA), specific combining ability (SCA) and their interactions with environment. Significant positive and negative GCA and SCA effects were detected for grain yield and other measured traits. However, a larger proportion of additive gene action than non-additive gene action was observed for grain yield and most measured traits. The analysis of molecular variance also showed substantial genetic differences within and between clusters. Except for HSCA, the difference in mean grain yield between the inter-group and intra-group hybrids was significant for each method. Pairwise comparison of the inter- and intra-group hybrids of all the methods showed significant differences between the WHGCAMT and all other methods in most cases.
WHGCAMT consistently produced higher-yielding inter-group hybrids and lower-yielding intra-group hybrids, achieving breeding efficiency improvements of 55.8%, 4.3%, 15.7%, and 11.4% over the HSCA, HSGCA, HGCAMT and molecular marker methods, respectively, under Striga infestation. Thus, WHGCAMT offers more precise, reliable and biologically meaningful heterotic groups among early-maturing maize inbred lines.
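
As background to the GCA and SCA effects discussed in this abstract, the sketch below estimates both from a table of single-cross means. It assumes the standard Griffing Method 4 estimators (F1 crosses only, no parents or reciprocals). The abstract does not state which estimator the authors used, and the parent labels and values are hypothetical.

```python
import itertools


def combining_abilities(crosses, parents):
    """crosses maps frozenset({i, j}) -> mean of the single cross i x j."""
    n = len(parents)
    if n < 3:
        raise ValueError("at least three parents are needed")
    # Array totals: sum of all crosses involving each parent.
    totals = {p: sum(v for pair, v in crosses.items() if p in pair) for p in parents}
    grand_total = sum(crosses.values())
    gca = {p: (n * totals[p] - 2.0 * grand_total) / (n * (n - 2)) for p in parents}
    sca = {pair: value
                 - sum(totals[p] for p in pair) / (n - 2)
                 + 2.0 * grand_total / ((n - 1) * (n - 2))
           for pair, value in crosses.items()}
    return gca, sca


# Toy diallel (an assumption): purely additive crosses, mean 10, known parental effects.
true_gca = {"A": 1.0, "B": -1.0, "C": 2.0, "D": -2.0}
parents = list(true_gca)
crosses = {frozenset(pair): 10.0 + true_gca[pair[0]] + true_gca[pair[1]]
           for pair in itertools.combinations(parents, 2)}
gca, sca = combining_abilities(crosses, parents)
```

Because the toy crosses are purely additive, the estimated GCA effects recover the parental effects and every SCA effect is zero. Non-zero SCA values would indicate cross-specific deviations from the additive expectation.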

11
Selecting genomes that matter: haplotype-based prioritization for iterative pangenome expansion

Marone, M. P.; Chen, E.; Himmelbach, A.; Haberer, G.; Spannagl, M.; Stein, N.; Mascher, M.

2026-05-18 genomics 10.64898/2026.05.13.724976 medRxiv
Top 0.1%
6.2%

Background: As pangenomes approach saturation, identifying additional genomes that contribute novel sequence information becomes increasingly difficult. Current sample-selection strategies often rely on global diversity metrics or variant counts and do not explicitly account for the composition of an existing pangenome, a limitation that becomes increasingly relevant as pangenomes mature. Here, we present SelHap, a haplotype-based pipeline that uses whole-genome sequencing (WGS) data to prioritize accessions based on their contribution of novel haplotypes relative to a defined background, enabling targeted and iterative pangenome expansion. Results: We applied SelHap to the barley pangenome, using 76 assembled genomes as a background to select new accessions from a large WGS panel. Using this approach, we generated chromosome-scale genome assemblies from 19 accessions selected with SelHap and from 17 elite lines selected based on their relevance in historical barley breeding. Across multiple benchmarking scenarios, SelHap-based selection consistently resulted in a greater increase in non-redundant (single-copy) pangenome sequence, demonstrating that prioritizing haplotype novelty relative to an existing background maximizes unrepresented sequence content. Conclusions: By transforming complex haplotype-clustering outputs into interpretable summaries and ranked candidate lists, SelHap provides a practical framework for targeted pangenome expansion. Beyond sample selection, SelHap can facilitate ancestry and germplasm comparisons across diverse panels. As WGS data become more accessible, SelHap offers a scalable and interpretable solution for extending mature pangenomes by explicitly targeting previously unrepresented sequence space.

12
Methodological pitfalls in plant pangenome gene family identification may lead to biased evolutionary inferences

Liu, S.; Zhang, W.; Yu, P.

2026-05-18 genomics 10.64898/2026.05.15.725319 medRxiv
Top 0.1%
6.2%

Pangenome-level gene family identification often applies sequence similarity clustering without phylogenetic or synteny information, which risks biologically misleading evolutionary inferences. Using five transcription factor families (bHLH, MYB, NAC, WRKY, MADS-box) across 401 rice pangenome accessions, we compared clustering strategies: OrthoFinder alone, cd-hit alone, MMseqs2 alone, and OrthoFinder-informed refinement by cd-hit or MMseqs2. Methods solely based on sequence similarity merged distinct orthogroups and generated fewer orthogroups than approaches incorporating graph-based orthology. Conflicting cluster assignments, measured against OrthoFinder, varied strongly among families, from approximately 14% in MADS-box to approximately 57% in MYB, and were associated with protein length differences. Core, shell, and cloud gene classifications shifted substantially depending on the method, especially in MYB, NAC, and WRKY families. Critically, Ka/Ks distributions for core genes were highly method-sensitive, with orthology-aware methods yielding more convergent and less variable estimates of selective pressure, whereas noncore gene estimates remained robust. These findings demonstrate that neglecting graph-based orthogroup inference inflates methodological artifacts. We recommend a two-step strategy: initial graph-based orthogroup delineation followed by sequence similarity refinement to balance evolutionary accuracy and resolution in pangenome-scale gene family studies.

13
Mapping of Stripe Rust and Leaf Rust Resistance Genes in the Hard Red Winter Wheat Population Green Hammer/Lonerider

Sharma, R.; Wang, M.; Chen, X.; Carver, B. F.; Guttieri, M.; St. Amand, P.; Bernardo, A.; Bai, G.; Liu, S.; Ara, A. M.; Aoun, M.

2026-05-15 genetics 10.64898/2026.05.13.724876 medRxiv
Top 0.1%
4.8%

Stripe rust and leaf rust, caused by Puccinia striiformis f. sp. tritici and P. triticina, respectively, are the most destructive wheat diseases in the southern Great Plains. Green Hammer is a hard red winter wheat (HRWW) cultivar released by Oklahoma State University in 2018 and has demonstrated a stable adult plant resistance to stripe rust and race-specific seedling resistance to leaf rust. To identify and map rust resistance loci, 109 doubled haploid (DH) lines derived from the cross between Green Hammer and another HRWW cultivar, Lonerider, were developed. Lonerider showed adult plant resistance to stripe rust but was susceptible to multiple P. triticina races. The DH lines were evaluated for stripe rust at the adult plant stage in greenhouse and field environments across Oklahoma, Kansas, and Washington, and for leaf rust at the seedling stage against seven U.S. P. triticina races and at the adult plant stage in Oklahoma and Texas. Genotyping-by-sequencing generated 6,078 polymorphic single-nucleotide polymorphisms used for genetic mapping. Quantitative trait loci (QTL) analysis identified 14 stripe rust and 8 leaf rust resistance QTL. For stripe rust, a major QTL in Green Hammer, QYr.osughln-2AS, was identified in the proximity of the 2NvS translocation. Three other major stripe rust resistance QTL were identified in Lonerider on chromosomes 2AL (two QTL) and 2BS (one QTL). For leaf rust, QLr.osughln-1DS and QLr.osughln-2DS.1 were the two major QTL identified in Green Hammer and most likely correspond to the all-stage resistance genes Lr21 and Lr39, respectively. In this study, we identified previously characterized genes as well as unknown genes that can be utilized in wheat breeding programs to enhance resistance to leaf rust and stripe rust.

14
Genomic and Transcriptomic Basis of Salinity Tolerance in Dry Pea

Acharya, S. R.; Bredu, E.; Navasca, H.; Worral, H.; Piche, L.; Saludares, R. A.; Johnson, J. P.; Coyne, C.; Mcphee, K.; Zhang, Q.; Ostlie, M.; Bandillo, N.

2026-05-08 genetics 10.64898/2026.05.05.722931 medRxiv
Top 0.1%
4.8%

Salinity is a major crop production constraint in dry pea (Pisum sativum L.), making the development of salt-tolerant varieties essential to improve crop productivity and land-use efficiency. The genetic mechanisms of salt tolerance in dry pea are largely unknown, and research on salt-tolerance genes is limited. In this study, we established comprehensive genomic and transcriptomic resources, along with a robust screening protocol, to dissect the genetic basis of salinity tolerance using two germplasm sets: the USDA pea diversity panel, consisting of approximately 200 globally sourced accessions, and a set of 300 modern elite lines from the NDSU Pulse Crops Breeding Program. Genetic variation for the salinity response was assessed based on ten phenotypic traits, with root dry weight, shoot dry weight, and specific root length identified as key indicators based on their heritability. Genome-wide association mapping uncovered significant genomic regions and several candidate genes linked to salt stress, with the strongest association found on chromosome 6. Overlapping QTL signals across traits suggest a shared genetic architecture underlying salinity tolerance. Field-based transcriptomic analysis further identified five putative genes involved in salinity response conserved across multiple crop species. Notably, Psat5g000800, encoding a glycosyl hydrolase, was markedly upregulated under salinity stress. These findings highlight the complex, multi-gene regulatory nature of salinity tolerance in dry pea and underscore the importance of functional validation of candidate genes. This study provides key insights and practical tools to support breeding efforts aimed at improving salt tolerance in dry pea.

15
An aromatic substrate prenyltransferase involved in the chemical diversification of flavonoids in Glycyrrhiza glabra

Kubomura, A.; Arai, T.; Han, J.; Munakata, R.; Yasuno, N.; Kobayashi, O.; Mamiya, K.; Nakamuta, K.; Wasano, N.; Yazaki, K.; Ohara, K.

2026-05-15 molecular biology 10.64898/2026.05.12.724477 medRxiv
Top 0.2%
4.1%

Prenylated isoflavonoids are widely distributed specialized metabolites within the Fabaceae and contribute to various characteristic biological activities for both plants and humans. Several aromatic prenyltransferases (PTs) have been identified in Glycyrrhiza species, which are the most widely consumed crude drugs in traditional Chinese medicine. However, these enzymes do not sufficiently explain the structural diversity of prenylated flavonoids produced in the Glycyrrhiza genus. To identify additional novel PTs, we used elicited cultured Glycyrrhiza glabra roots as source material, in which elicitor treatment of cultured roots increased the accumulation of multiple prenylated flavonoids. To identify the responsible enzyme, PT candidates were screened using G. uralensis transcriptomes, currently the sole publicly available transcriptomic resource within the genus, and a homolog designated GgBSPT1 (BSPT; a broad-substrate prenyltransferase) was subsequently isolated from elicited cultured G. glabra roots. GgBSPT1 differed from previously identified Glycyrrhiza PTs in both amino acid sequence and enzymatic properties. GgBSPT1 catalyzed 3'-prenylation of isoliquiritigenin and 6-prenylation of five flavonoids, i.e., this PT displayed broad substrate acceptance across 20 distinct flavonoid structures. Overall, elicited cultured G. glabra roots enabled the identification of a previously unrecognized PT that is functionally distinct from earlier reported Glycyrrhiza PTs. This study provides a new insight into the metabolic plasticity of Glycyrrhiza species and expands the enzymatic toolkit for future metabolic engineering of prenylated phytochemicals by the unusually broad substrate specificity of GgBSPT1.

Main conclusion: Using cultured Glycyrrhiza glabra roots, we identified a new prenyltransferase involved in the formation of a variety of flavonoids, thereby revealing novel prenylated isoflavonoid pathways in licorice.

16
Novel linkage disequilibrium-based genotype-by-environmental interaction method for genomic prediction of cotton yield and fibre quality traits

Li, Z.; Li, X.; Liu, S.; Wilson, I.; Zhu, Q.-H.; Stiller, W.; Conaty, W.

2026-05-06 plant biology 10.64898/2026.05.03.722538 medRxiv
Top 0.2%
4.0%

Genomic prediction (GP) across diverse environments has the potential to accelerate genetic gain in cotton breeding programs. A major challenge in GP is modelling genotype-by-environment interactions (GEI), which is essential for selecting stable and high-performing genotypes under variable production conditions. However, incorporating GEI into GP models increases the dimensionality and computational complexity, risking complex models that are impractical to use on commercial breeding-scale datasets because of run times and computational demands. This study addresses two primary aims. First, we evaluate the practical benefits of GEI-informed GP for predicting economically important cotton traits. Second, advanced statistical modelling strategies are developed and assessed for integrating genomic and environmental data at scale. We propose a dimensionality reduction approach that combines linkage disequilibrium network analysis with principal component techniques to reduce redundancy while preserving informative variation. Using this reduced dataset, we implement Bayesian linear regression models and, for comparison, deep residual neural networks for genomic prediction. Analyses were conducted on a large multi-environment dataset from the CSIRO cotton breeding program, comprising 3,236 breeding lines, 54 environmental covariates, and 8,049 yield and fibre quality phenotype records collected over 10 years and 9 locations representing 41 year-location combinations. Results demonstrate that Bayesian linear regression approaches generally outperform BG-BLUP models, with all three linear/linear mixed methods providing clearly more reliable performance than the deep learning models. These findings highlight the value of using interpretable statistical models for integrating genomic and environmental information to support selection decisions under diverse environmental conditions.

17
Extending the seasons at both ends? Understanding the physiological and genetic context required for stay green mediated yield increase in wheat (Triticum aestivum)

Chapman, E. A.; Orford, S.; Beeby, R.; Lage, J.; Griffiths, S.

2026-05-23 plant biology 10.64898/2026.05.22.727135 medRxiv
Top 0.2%
3.9%

Flowering time and monocarpic senescence are tightly environmentally and genetically controlled. Typically, early flowering and staygreen traits are associated with opposing life-history strategies; stress avoidance versus adaptation; with flowering time an overarching regulator of crop cycle length. We developed RIL populations segregating for Ppd-1 and NAM-1 variation, which are otherwise isogenic. Multi-year field experiments enabled exploration and uncoupling of the relationship between heading and staygreen traits. Heading date manipulation enabled introduction of staygreen traits to their target breeding environments, characterised by a hot-finish. Under moderate stress, we report a 2.9% and 1.9% increase in grain width (P<0.0001), and 5.8% and 3.7% increase in TGW (P<0.0001), plus significantly greater yield (P<0.1) for late heading staygreen RILs homozygous for NAM-A1, and NAM-D1 missense variants, respectively. Grain yield increases were proportionate to the delay in senescence, being greater for the NAM-A1 than the NAM-D1 variant. For RIL populations segregating for both traits, senescence variation was observed relative to heading-date. Regarding grain yield, the staygreen trait-associated increase in source size could not compensate for the Ppd-1a associated pleiotropic reduction in sink size, even under hypothesised continental target breeding environments, with trait competition identified. Therefore, maximising the benefits associated with staygreen traits, especially in early-heading favouring environments, requires targeted manipulation of source-sink dynamics, and we propose multiple strategies. Highlight: Staygreen traits were associated with extending grain fill duration, increasing grain width, TGW and grain yield. There appears to be an antagonistic relationship between earlier heading and staygreen traits.

18
Selection For Yield Enhanced Rhizobial Mutualism In Pea

Porter, S.; Millar, N.; Coyne, C.

2026-05-18 plant biology 10.64898/2026.05.15.725492 medRxiv
Top 0.2%
3.7%

Crop improvement can enhance food security, but side effects, such as trade-offs between valuable agronomic traits, are common. Likewise, fertilisation helps ensure high yields, but can devalue mutualisms with soil microbes that would otherwise be essential for nutrient acquisition. If the need for nutritional mutualisms is reduced in crops, mutualisms could be disrupted by selection relaxation or allocation trade-offs. Thus, crops could achieve high yields in spite of, or because of, disruption of nutritional mutualisms. Alternatively, the highest-yielding plants might flourish because they maximise nutrient acquisition from both symbionts and the soil. Here, enhanced mutualism could evolve over the course of agricultural crop improvement. To investigate whether high yields in cultivars and wild accessions are negatively or positively genetically correlated with outcomes in the legume-rhizobia mutualism, we measured whether yield and symbiosis traits trade-off or are positively genetically correlated among cultivars and wild accessions. We also tested whether this relationship differs between accessions released before or after 1950. We measured genetic correlations between yield and mutualism traits in 87 domesticated pea (Pisum sativum) accessions in a common garden agricultural field across three years. Seed yield and N2 fixation (%Ndfa) were positively genetically correlated. While N fixation was more strongly predictive of yield in the pre-1950 accessions than the post-1950 accessions, the underlying positive genetic correlation between the traits did not differ between the groups. The positive genetic correlation between yield and N2 fixation indicates that selection to increase yields has maintained or increased the benefits of the rhizobial mutualism in pea. 
Our findings predict that breeding to increase yield may continue to produce pea cultivars that get a greater proportion of their N from rhizobia, enhancing symbiotic mutualism and reducing the proportion of N supplied by fertilisation.

19
Wheat MYB transcription factor TaMYB83-7B regulates seed dormancy by influencing the balance between abscisic acid and gibberellin

Zhuang, Q.; Cao, S.; Zhang, L.; Wang, H.; Li, W.; Wang, Z.; Zhu, G.; Lu, W.; He, C.; Gao, W.; Chen, C.; Ma, C.; Zhang, H.; Chang, C.

2026-05-21 molecular biology 10.64898/2026.05.19.726193 medRxiv
Top 0.2%
3.6%

In wheat, weak seed dormancy (SD) is related to an increased tendency for pre-harvest sprouting (PHS), which reduces yield and quality. However, the molecular mechanism underlying SD remains elusive. Here, we identified a wheat R2R3-MYB transcription factor (TaMYB83-7B) related to SD. Expression analysis showed that TaMYB83-7B was highly expressed in wheat seeds, and was more highly expressed in strong-dormancy varieties than in weak-dormancy varieties. Sequence and association analysis indicated that T/C mutations at -907 bp and -1133 bp in the TaMYB83-7B promoter were significantly associated with wheat SD, with C at both sites related to strong dormancy. Dual-luciferase reporter assays demonstrated that the transcriptional activity of the TaMYB83-7B promoter was significantly higher in strong-dormancy varieties than in weak-dormancy varieties. Further analyses indicated that TaMYB83-7B functions as a transcriptional inhibitor. Germination experiments revealed that overexpression of TaMYB83-7B significantly enhanced SD, while its loss-of-function reduced SD. Finally, TaMYB83-7B was found to regulate SD by influencing the balance between abscisic acid (ABA) and gibberellin (GA) in wheat seeds. Overall, the results of this study enhance our understanding of the complex regulatory mechanism underlying SD, and provide gene targets and molecular markers for the genetic improvement of PHS resistance in wheat.

20
Functional genomic map of local adaptation in sorghum to guide allele mining

Xu, Y.; Das, A.; Cruet-Burgos, C.; Morris, G. P.; Lasky, J. R.

2026-05-18 evolutionary biology 10.64898/2026.05.17.725773 medRxiv
Top 0.2%
3.5%

Genomic data from genebanks could be exploited to find alleles adapted to target environments for resilience breeding, but it can be difficult to prioritize among the thousands of accessions and millions of genomic variants. There are competing hypotheses for the molecular basis and architecture of local adaptations: e.g. whether cis-regulatory versus amino acid changing variants are more important; or whether small-effect, low pleiotropy versus large-effect, high pleiotropy variants are more important. Here, we compare a range of variant types and genomic contexts thought to influence effect size, pleiotropy, and selection for their role in local adaptation in 443 whole genome resequenced African sorghum landraces. We used genotype-environment associations (GEAs) as evidence of local adaptation. We found that GEAs were particularly enriched in the vicinity of genes and depleted elsewhere. However, enrichment was strongest in likely cis-regulatory contexts: accessible chromatin, unmethylated regions, and in transposable elements close to genes. Near genes, there were clear peaks in GEAs at the transcription start site, where mutations are demonstrated to have the largest expression effects. Additionally, GEAs in accessible chromatin and unmethylated regions were better predictors of genetic variation in response to experimental drought than comparable loci. Having tested hypotheses about the variants underlying local adaptation, we can now use this knowledge of the importance of cis-regulatory variation in the search for new environmentally-adaptive alleles for plant improvement.